-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use fused types for overloaded function signatures #14969
Use fused types for overloaded function signatures #14969
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I believe this is the first use of fused types in our Cython
/merge |
pylibcudf.Table([col.to_pylibcudf(mode="read") for col in target_columns]), | ||
) | ||
tbl = pylibcudf.copying.scatter( | ||
pylibcudf.Table([col.to_pylibcudf(mode="read") for col in sources]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pull out the line that defines the table/scalar here and declare a separate variable for it? It is hard to read a three-line expression that just supplies one argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair. TBH I'm not too concerned about how legacy Cython code looks right now because once we've completed pylibcudf we'll need to rewrite cudf internals substantially to leverage pylibcudf properly, so all of this will go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would pulling that out into a variable be able to avoid the conditional cdef-ing problem? Why/why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, was this question meant for the other thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm trying to ask a related question here. I realize now it's because this is def
and the other is cpdef
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right that's why I was confused, now I see. You're on the right track; it's less about this function being a def or a cpdef, and more about what it's calling. We're calling pylibcudf as a Python API here, so it accepts arbitrary objects as input and we're not cdefing them. In the other places we have to cdef the inputs because we are calling cdef functions (C++ cdef extern functions) that require strongly typed input.
(<Column> lhs).view(), | ||
dereference((<Scalar> rhs).c_obj), | ||
lhs.view(), | ||
dereference(rhs.c_obj), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to evaluate the lhs/rhs objects ahead of time and reduce the dispatching logic here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that what you're envisioning is something like (pseudocode):
if lhs is column and rhs is column:
c_lhs = lhs.view()
c_rhs = rhs.view()
elif lhs is column and rhs is scalar:
c_lhs = lhs.view()
c_rhs = dereference(rhs.c_obj)
...
result = move(cpp_binaryop(c_lhs, c_rhs, ...))
...
Is that what you're thinking?
If so, unfortunately no that doesn't work. There are two reasons:
- cdefs can't go inside conditionals. That includes conditionals that are compile-time branches based on fused types. Therefore, you'd still need separate cdefs for
c_lhs_scalar
andc_lhs_column
, for example, then you'd assign them inside the conditional block. That's at least as verbose, but one could perhaps argue that it's more DRY. However: - The C++ function call needs to resolve to the correct overload, so you have to pass the appropriately-typed (known at compile-time) objects to calling
cpp_binaryop
. So no matter what you need to invoke the function separately for each possible overload.
Ultimately the translation from Cython types (fused or not) to C++ types remains a point of friction.
Let me know if that answered your question or if you meant something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. That was my expectation but I didn't want to miss out on anything special if fused types would allow this.
cpp_copying.empty_like( | ||
(<Column> input).view(), | ||
else: | ||
source_scalars = _as_vector(source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, is it possible to reduce this dispatching logic when using fused types?
Description
This change makes the pylibcudf API more convenient and a more faithful reproduction of the underlying libcudf APIs that offer overloaded signatures. In cases like binary ops where we were previously using runtime instance checks, this change also removes unnecessary runtime overhead if the calling code is Cython since in those cases the types at the call site are known at compile time.
Checklist